Arm backend: Minor composable_quantizer improvements#19330

Open
AdrianLundell wants to merge 1 commit intopytorch:mainfrom
AdrianLundell:change-1253792

@AdrianLundell AdrianLundell commented May 6, 2026

  • Make sure the root node of a shared cluster is the cluster's topologically first node in the model, to avoid a crash in torchao.

  • Do not report get_attr nodes as non-annotated: they are never quantized, so listing them only makes the logs unnecessarily long.

  • Make SharedQuantization logging more concise. Do not warn about multiple users, since this is already reported in the quantizer report, and use a single simple log message when nodes are left unquantized.

  • Make the pre-transform-for-annotation report more minimal and print it only when relevant; update the notebook example accordingly.
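As an illustration of the first bullet, here is a minimal sketch of picking the topologically first node of a cluster as its root. It is not the code from this PR: `Node` and `order_cluster` are hypothetical stand-ins written in plain Python, and the graph's node list is assumed to already be in topological order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a graph node; not the torch.fx API."""

    name: str
    op: str = "call_function"


def order_cluster(graph_nodes, cluster):
    """Return the cluster's nodes sorted by their position in graph_nodes.

    graph_nodes is assumed to be in topological order already.
    """
    position = {node: index for index, node in enumerate(graph_nodes)}
    return sorted(cluster, key=position.__getitem__)


graph_nodes = [Node("x", "placeholder"), Node("cat"), Node("view"), Node("relu")]
# A set has no guaranteed iteration order, so the root must be chosen explicitly.
cluster = {graph_nodes[3], graph_nodes[1], graph_nodes[2]}

ordered_nodes = order_cluster(graph_nodes, cluster)
root_node = ordered_nodes[0]
print(root_node.name)  # cat
```

Sorting by graph position makes the choice of root deterministic regardless of how the cluster was collected.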

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell

Signed-off-by: Adrian Lundell <adrian.lundell@arm.com>
Change-Id: I412918235799c99ab4b1b1e1f6412de4c906766f
Copilot AI review requested due to automatic review settings May 6, 2026 12:25
@AdrianLundell AdrianLundell added partner: arm For backend delegation, kernels, demo, etc. from the 3rd-party partner, Arm ciflow/trunk release notes: none Do not include this in the release notes labels May 6, 2026

pytorch-bot Bot commented May 6, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19330

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (10 Unrelated Failures)

As of commit e712c31 with merge base 1debeb6 (image):

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label May 6, 2026
@github-actions github-actions Bot added the module: arm Issues related to arm backend label May 6, 2026

Copilot AI left a comment

Pull request overview

This PR refines Arm backend composable quantizer reporting and SharedQspecQuantizer behavior to reduce noisy logs and avoid a torchao crash by stabilizing shared-cluster root selection.

Changes:

  • Update pre-transform reporting to be more minimal and only emitted when relevant; update the Arm quantizer tutorial accordingly.
  • Exclude get_attr nodes from “non-annotated” reporting to avoid misleading/verbose reports.
  • Ensure SharedQspecQuantizer uses the topologically-first node as the shared-cluster root and simplify its accept/reject reporting.
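The get_attr filtering described above could look roughly like the following sketch. It is an assumption-laden illustration, not the reporter's actual code: `unannotated_nodes` is a hypothetical helper, and nodes are modeled as plain `(name, op)` tuples rather than real graph nodes.

```python
def unannotated_nodes(nodes, annotated_names):
    """List node names to report as non-annotated.

    nodes: iterable of (name, op) tuples (a simplified, hypothetical model).
    get_attr nodes are skipped because they are never quantized.
    """
    return [
        name
        for name, op in nodes
        if op != "get_attr" and name not in annotated_names
    ]


nodes = [
    ("weight", "get_attr"),
    ("conv", "call_function"),
    ("softmax", "call_function"),
]
report = unannotated_nodes(nodes, annotated_names={"conv"})
print(report)  # ['softmax']
```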

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| examples/arm/quantizer_tutorial.ipynb | Updates documentation to explain the two reports (pre-transform vs final) and aligns example output formatting. |
| backends/cortex_m/quantizer_reporter.py | Filters get_attr out of unannotated-node reporting to reduce unnecessary noise. |
| backends/arm/quantizer/arm_quantizer.py | Replaces the pre-transform quantizer reporter output with a minimal debug log of nodes not marked for decomposition. |
| backends/arm/quantizer/arm_quantizer_utils.py | Stabilizes shared-cluster root selection and simplifies SharedQuantization reject handling/logging. |

Comment on lines +1119 to +1127
msg = """
----------------------------------------------------------------------------------------------------
PRE-TRANSFORM FOR ANNOTATION QUANTIZATION REPORT
----------------------------------------------------------------------------------------------------
The following nodes are not marked for quantization and will not be decomposed in the transform for annotation pipeline:\n"""
for node in non_quantized_nodes:
    msg += f" {node.name}\n"

logger.debug(msg)
Comment on lines +625 to +628
self.report_reject(
    ordered_nodes,
    "All inputs and outputs to these nodes are non-quantized.",
)
Comment on lines +595 to +596
# Ensure the root node is the first one in the graph.
root_node = ordered_nodes[0]
"decomposition must not happen if it should be kept in float.\n",
"\n",
"**This is important to be aware of when doing mixed quantization since this means that for an operator to be fully quantized,\n",
"both the original operator and the decomposition needs to be targeted.**\n",